Semantics-Based Reference Resolution in Technical Text Processing: An Exploration of Using the WordNet Database in the Computerized Comprehensibility System
Abstract
The Computerized Comprehensibility System (CCS) provides an automated copy-editing function: it generates a "mark-up" of a draft technical document by simulating the simpler comprehension processes of a human reader, and then criticizes the text wherever these simple processes cannot successfully comprehend the material. A key CCS function is criticizing the coherence of the material by tracking which objects are mentioned in the passage. A common comprehensibility problem is that the text mentions a new object using the syntactic structures appropriate for an already-known object. If the reader must infer that the presence of the new object is implied by an earlier-mentioned object, the result is a potential break in the coherence of the text. CCS criticizes all such coherence breaks. However, many such inferences are actually easy for most readers, because they require only general knowledge rather than specialized knowledge about the domain. In those cases, the CCS criticism of a coherence break is a false alarm. This report describes exploratory work with an augmented form of CCS, in which the WordNet database is used as a source of general knowledge so that CCS can make the same kind of general-knowledge inferences that human readers make to overcome coherence breaks.
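The idea in the abstract can be illustrated with a minimal Python sketch. It is not the CCS implementation: the function names, the `PART_OF` table, and the choice of part-whole links as the bridging relation are all assumptions made for illustration, and the small hand-written table merely stands in for a WordNet lookup. The sketch tracks which objects have been mentioned, flags a definite reference to an unmentioned object, and suppresses the flag when general knowledge links the new object to a known one.

```python
# Toy stand-in for WordNet part-whole links. These entries are illustrative
# assumptions, not data taken from WordNet or from CCS.
PART_OF = {
    "engine": {"car"},
    "piston": {"engine"},
    "door": {"car", "house"},
}


def implied_by(noun, known, max_depth=2):
    """Return True if a known object is reachable from `noun` by following
    at most `max_depth` part-of links."""
    frontier = {noun}
    for _ in range(max_depth):
        frontier = {whole for part in frontier for whole in PART_OF.get(part, ())}
        if frontier & known:
            return True
    return False


def critique(mentions):
    """Check each definite reference against the objects mentioned so far.

    `mentions` is a list of (noun, is_definite) pairs in text order.
    Returns (noun, verdict) for every definite reference to a new object.
    """
    known = set()
    report = []
    for noun, is_definite in mentions:
        if is_definite and noun not in known:
            if implied_by(noun, known):
                verdict = "bridged by general knowledge"
            else:
                verdict = "coherence break"
            report.append((noun, verdict))
        known.add(noun)
    return report


# "A car ... the engine ... the valve ... the car"
mentions = [("car", False), ("engine", True), ("valve", True), ("car", True)]
result = critique(mentions)
```

Here "the engine" is treated as a false alarm because the toy table links it to the already-mentioned car, while "the valve" has no such link and would still be criticized. Which WordNet relations the augmented CCS actually consults is not stated in the abstract.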